AITopics | channel attention

Collaborating Authors

channel attention

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Shallow Neural Networks Learn Low-Degree Spherical Polynomials with Learnable Channel Attention

Yang, Yingzhen

arXiv.org Machine LearningDec-24-2025

We study the problem of learning a low-degree spherical polynomial of degree $\ell_0 = Θ(1) \ge 1$ defined on the unit sphere in $\RR^d$ by training an over-parameterized two-layer neural network (NN) with channel attention in this paper. Our main result is the significantly improved sample complexity for learning such low-degree polynomials. We show that, for any regression risk $\eps \in (0,1)$, a carefully designed two-layer NN with channel attention and finite width of $m \ge Θ({n^4 \log (2n/δ)}/{d^{2\ell_0}})$ trained by the vanilla gradient descent (GD) requires the lowest sample complexity of $n \asymp Θ(d^{\ell_0}/\eps)$ with probability $1-δ$ for every $δ\in (0,1)$, in contrast with the representative sample complexity $Θ\pth{d^{\ell_0} \max\set{\eps^{-2},\log d}}$, where $n$ is the training daata size. Moreover, such sample complexity is not improvable since the trained network renders a sharp rate of the nonparametric regression risk of the order $Θ(d^{\ell_0}/{n})$ with probability at least $1-δ$. On the other hand, the minimax optimal rate for the regression risk with a kernel of rank $Θ(d^{\ell_0})$ is $Θ(d^{\ell_0}/{n})$, so that the rate of the nonparametric regression risk of the network trained by GD is minimax optimal. The training of the two-layer NN with channel attention consists of two stages. In Stage 1, a provable learnable channel selection algorithm identifies the ground-truth channel number $\ell_0$ from the initial $L \ge \ell_0$ channels in the first-layer activation, with high probability. This learnable selection is achieved by an efficient one-step GD update on both layers, enabling feature learning for low-degree polynomial targets. In Stage 2, the second layer is trained by standard GD using the activation function with the selected channels.

neural network, polynomial, probability, (15 more...)

arXiv.org Machine Learning

2512.20562

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(8 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

SDE-Attention: Latent Attention in SDE-RNNs for Irregularly Sampled Time Series with Missing Data

Fang, Yuting, Gia, Qouc Le, Salim, Flora

arXiv.org Artificial IntelligenceDec-1-2025

Irregularly sampled time series with substantial missing observations are common in healthcare and sensor networks. We introduce SDE-Attention, a family of SDE-RNNs equipped with channel-level attention on the latent pre-RNN state, including channel recalibration, time-varying feature attention, and pyramidal multi-scale self-attention. We therefore conduct a comparison on a synthetic periodic dataset and real-world benchmarks, under varying missing rate. Latent-space attention consistently improves over a vanilla SDE-RNN. On the univariate UCR datasets, the LSTM-based time-varying feature model SDE-TVF-L achieves the highest average accuracy, raising mean performance by approximately 4, 6, and 10 percentage points over the baseline at 30%, 60% and 90% missingness, respectively (averaged across datasets). On multivariate UEA benchmarks, attention-augmented models again outperform the backbone, with SDE-TVF-L yielding up to a 7% gain in mean accuracy under high missingness. Among the proposed mechanisms, time-varying feature attention is the most robust on univariate datasets. On multivariate datasets, different attention types excel on different tasks, showing that SDE-Attention can be flexibly adapted to the structure of each problem.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2511.23238

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

EGSA-PT:Edge-Guided Spatial Attention with Progressive Training for Monocular Depth Estimation and Segmentation of Transparent Objects

Omotara, Gbenga, Farag, Ramy, Tousi, Seyed Mohamad Ali, DeSouza, G. N.

arXiv.org Artificial IntelligenceNov-20-2025

Transparent object perception remains a major challenge in computer vision research, as transparency confounds both depth estimation and semantic segmentation. Recent work has explored multi-task learning frameworks to improve robustness, yet negative cross-task interactions often hinder performance. In this work, we introduce Edge-Guided Spatial Attention (EGSA), a fusion mechanism designed to mitigate destructive interactions by incorporating boundary information into the fusion between semantic and geometric features. On both Syn-TODD and ClearPose benchmarks, EGSA consistently improved depth accuracy over the current state of the art method (MODEST), while preserving competitive segmentation performance, with the largest improvements appearing in transparent regions. Besides our fusion design, our second contribution is a multi-modal progressive training strategy, where learning transitions from edges derived from RGB images to edges derived from predicted depth images. This approach allows the system to bootstrap learning from the rich textures contained in RGB images, and then switch to more relevant geometric content in depth maps, while it eliminates the need for ground-truth depth at training time. Together, these contributions highlight edge-guided fusion as a robust approach capable of improving transparent object perception.

apreprint-november20, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2511.1497

Country: North America > United States (0.16)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Auto Learning Attention Benteng Ma

Neural Information Processing SystemsOct-2-2025, 02:12:15 GMT

Attention modules have been demonstrated effective in strengthening the representation ability of a neural network via reweighting spatial or channel features or stacking both operations sequentially. However, designing the structures of different attention operations requires a bulk of computation and extensive expertise.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Add feedback

WACA-UNet: Weakness-Aware Channel Attention for Static IR Drop Prediction in Integrated Circuit Design

Seo, Youngmin, Kwon, Yunhyeong, Park, Younghun, Kim, HwiRyong, Eum, Seungho, Kim, Jinha, Song, Taigon, Kim, Juho, Park, Unsang

arXiv.org Artificial IntelligenceJul-28-2025

Accurate spatial prediction of power integrity issues, such as IR drop, is critical for reliable VLSI design. However, traditional simulation-based solvers are computationally expensive and difficult to scale. We address this challenge by reformulating IR drop estimation as a pixel-wise regression task on heterogeneous multi-channel physical maps derived from circuit layouts. Prior learning-based methods treat all input layers (e.g., metal, via, and current maps) equally, ignoring their varying importance to prediction accuracy. To tackle this, we propose a novel Weakness-Aware Channel Attention (WACA) mechanism, which recursively enhances weak feature channels while suppressing over-dominant ones through a two-stage gating strategy. Integrated into a ConvNeXtV2-based attention U-Net, our approach enables adaptive and balanced feature representation. On the public ICCAD-2023 benchmark, our method outperforms the ICCAD-2023 contest winner by reducing mean absolute error by 61.1% and improving F1-score by 71.0%. These results demonstrate that channel-wise heterogeneity is a key inductive bias in physical layout analysis for VLSI.

artificial intelligence, machine learning, prediction, (19 more...)

arXiv.org Artificial Intelligence

2507.19197

Country: Asia > South Korea (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Semiconductors & Electronics (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

A Transformer-Based Conditional GAN with Multiple Instance Learning for UAV Signal Detection and Classification

Liu, Haochen, Bi, Jia, Wang, Xiaomin, Yang, Xin, Wang, Ling

arXiv.org Artificial IntelligenceJul-22-2025

Unmanned Aerial Vehicles (UAVs) are increasingly used in surveillance, logistics, agriculture, disaster management, and military operations. Accurate detection and classification of UAV flight states, such as hovering, cruising, ascending, or transitioning, which are essential for safe and effective operations. However, conventional time series classification (TSC) methods often lack robustness and generalization for dynamic UAV environments, while state of the art(SOTA) models like Transformers and LSTM based architectures typically require large datasets and entail high computational costs, especially with high-dimensional data streams. This paper proposes a novel framework that integrates a Transformer-based Generative Adversarial Network (GAN) with Multiple Instance Locally Explainable Learning (MILET) to address these challenges in UAV flight state classification. The Transformer encoder captures long-range temporal dependencies and complex telemetry dynamics, while the GAN module augments limited datasets with realistic synthetic samples. MIL is incorporated to focus attention on the most discriminative input segments, reducing noise and computational overhead. Experimental results show that the proposed method achieves superior accuracy 96.5% on the DroneDetect dataset and 98.6% on the DroneRF dataset that outperforming other SOTA approaches. The framework also demonstrates strong computational efficiency and robust generalization across diverse UAV platforms and flight states, highlighting its potential for real-time deployment in resource constrained environments.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2507.14592

Country:

North America > United States (0.28)
Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Robotics & Automation (0.74)
Aerospace & Defense > Aircraft (0.74)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Comparative Study of NAFNet Baselines for Image Restoration

Esaulov, Vladislav, Esfahani, M. Moein

arXiv.org Artificial IntelligenceJun-25-2025

We study NAFNet (Nonlinear Activation Free Network), a simple and efficient deep learning baseline for image restoration. By using CIFAR10 images corrupted with noise and blur, we conduct an ablation study of NAFNet's core components. Our baseline model implements SimpleGate activation, Simplified Channel Activation (SCA), and LayerNormalization. We compare this baseline to different variants that replace or remove components. Quantitative results (PSNR, SSIM) and examples illustrate how each modification affects restoration performance. Our findings support the NAFNet design: the SimpleGate and simplified attention mechanisms yield better results than conventional activations and attention, while LayerNorm proves to be important for stable training. We conclude with recommendations for model design, discuss potential improvements, and future work.

artificial intelligence, channel attention, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2506.19845

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States (0.05)

Genre: Research Report > New Finding (0.89)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

SAAT: Synergistic Alternating Aggregation Transformer for Image Super-Resolution

Wu, Jianfeng, Xu, Nannan

arXiv.org Artificial IntelligenceJun-5-2025

Single image super-resolution is a well-known downstream task which aims to restore low-resolution images into high-resolution images. At present, models based on Transformers have shone brightly in the field of super-resolution due to their ability to capture long-term dependencies in information. However, current methods typically compute self-attention in nonoverlapping windows to save computational costs, and the standard self-attention computation only focuses on its results, thereby neglecting the useful information across channels and the rich spatial structural information generated in the intermediate process. Channel attention and spatial attention have, respectively, brought significant improvements to various downstream visual tasks in terms of extracting feature dependency and spatial structure relationships, but the synergistic relationship between channel and spatial attention has not been fully explored yet.To address these issues, we propose a novel model. Synergistic Alternating Aggregation Transformer (SAAT), which can better utilize the potential information of features. In SAAT, we introduce the Efficient Channel & Window Synergistic Attention Group (CWSAG) and the Spatial & Window Synergistic Attention Group (SWSAG). On the one hand, CWSAG combines efficient channel attention with shifted window attention, enhancing non-local feature fusion, and producing more visually appealing results. On the other hand, SWSAG leverages spatial attention to capture rich structured feature information, thereby enabling SAAT to more effectively extract structural features.Extensive experimental results and ablation studies demonstrate the effectiveness of SAAT in the field of super-resolution. SAAT achieves performance comparable to that of the state-of-the-art (SOTA) under the same quantity of parameters.

artificial intelligence, information, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2506.0374

Country: Europe > Spain (0.28)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Panda: A pretrained forecast model for universal representation of chaotic dynamics

Lai, Jeffrey, Bao, Anthony, Gilpin, William

arXiv.org Machine LearningMay-21-2025

Chaotic systems are intrinsically sensitive to small errors, challenging efforts to construct predictive data-driven models of real-world dynamical systems such as fluid flows or neuronal activity. Prior efforts comprise either specialized models trained separately on individual time series, or foundation models trained on vast time series databases with little underlying dynamical structure. Motivated by dynamical systems theory, we present Panda, Patched Attention for Nonlinear DynAmics. We train Panda on a novel synthetic, extensible dataset of $2 \times 10^4$ chaotic dynamical systems that we discover using an evolutionary algorithm. Trained purely on simulated data, Panda exhibits emergent properties: zero-shot forecasting of unseen real world chaotic systems, and nonlinear resonance patterns in cross-channel attention heads. Despite having been trained only on low-dimensional ordinary differential equations, Panda spontaneously develops the ability to predict partial differential equations without retraining. We demonstrate a neural scaling law for differential equations, underscoring the potential of pretrained models for probing abstract mathematical domains like nonlinear dynamics.

evolutionary algorithm, large language model, machine learning, (19 more...)

arXiv.org Machine Learning

2505.13755

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Asia > India > Tripura (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.54)
(2 more...)

Add feedback

Filters

Collaborating Authors

channel attention

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

103303dd56a731e377d01f6a37badae3-Paper.pdf

Shallow Neural Networks Learn Low-Degree Spherical Polynomials with Learnable Channel Attention

SDE-Attention: Latent Attention in SDE-RNNs for Irregularly Sampled Time Series with Missing Data

EGSA-PT:Edge-Guided Spatial Attention with Progressive Training for Monocular Depth Estimation and Segmentation of Transparent Objects

Auto Learning Attention Benteng Ma

WACA-UNet: Weakness-Aware Channel Attention for Static IR Drop Prediction in Integrated Circuit Design

A Transformer-Based Conditional GAN with Multiple Instance Learning for UAV Signal Detection and Classification

A Comparative Study of NAFNet Baselines for Image Restoration

SAAT: Synergistic Alternating Aggregation Transformer for Image Super-Resolution

Panda: A pretrained forecast model for universal representation of chaotic dynamics